Distance Sensitive Bloom Filters Without False Negatives

Authors

  • Mayank Goswami
  • Rasmus Pagh
  • Francesco Silvestri
  • Johan Sivertsen
Abstract

A Bloom filter is a widely used data structure for representing a set S and answering queries of the form “Is x in S?”. By allowing some false positive answers (saying ‘yes’ when the answer is in fact ‘no’), Bloom filters use space significantly below what is required for storing S. In the distance sensitive setting we work with a set S of (Hamming) vectors and seek a data structure that offers a similar trade-off, but answers queries of the form “Is x close to an element of S?” (in Hamming distance). Previous work on distance sensitive Bloom filters has accepted both false positive and false negative answers. Absence of false negatives is of critical importance in many applications of Bloom filters, so it is natural to ask whether this can also be achieved in the distance sensitive setting. Our main contribution is upper and lower bounds (that are tight in several cases) for space usage in the distance sensitive setting where false negatives are not allowed.

∗ The research leading to these results has received funding from the European Research Council under the EU 7th Framework Programme, ERC grant agreement no. 614331.
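For readers unfamiliar with the two query types contrasted in the abstract, the following is a minimal illustrative sketch, not the construction from the paper. It shows a classic Bloom filter for “Is x in S?” (false positives possible, false negatives impossible) alongside the exact, space-hungry predicate “Is x within Hamming distance r of an element of S?” that a distance sensitive filter would approximate. The names `BloomFilter`, `might_contain`, `hamming`, `near_member`, and the radius parameter `r` are assumptions introduced here for illustration.

```python
import hashlib


class BloomFilter:
    """Classic Bloom filter sketch: may answer 'yes' wrongly, never 'no' wrongly."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by salting BLAKE2b with an index.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # Every position set by add() is still set, so members always pass.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def hamming(x: str, y: str) -> int:
    """Hamming distance between two equal-length bit strings."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    return sum(a != b for a, b in zip(x, y))


def near_member(stored: set, x: str, r: int) -> bool:
    """Exact answer to 'is x within distance r of some element of stored?'.

    This stores the whole set; it is the ground truth a distance sensitive
    filter would approximate, and r is a hypothetical radius parameter.
    """
    return any(hamming(x, s) <= r for s in stored)


bf = BloomFilter(num_bits=1024, num_hashes=3)
bf.add(b"0000")
print(bf.might_contain(b"0000"))        # True: inserted items are never missed
print(near_member({"0000"}, "0001", 1))  # True: distance 1 is within r = 1
```

The exact-membership filter gives no guarantee for `0001` even though it differs from a stored vector in a single bit, which is the gap the distance sensitive setting addresses.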

Similar resources

A Robust Bloom Filter

A Bloom filter is a space-efficient randomized data structure representing a set for membership queries. Faults in Bloom filters, however, cannot guarantee no false negatives. In this paper, we present a simple redundancy scheme for detecting false negatives and tolerating false positives induced by faults in Bloom filters during normal operation. A spare hashing unit with a simple coding techn...

Retouched Bloom Filters: Allowing Networked Applications to Flexibly Trade Off False Positives Against False Negatives

Where distributed agents must share voluminous set membership information, Bloom filters provide a compact, though lossy, way for them to do so. Numerous recent networking papers have examined the trade-offs between the bandwidth consumed by the transmission of Bloom filters, and the error rate, which takes the form of false positives, and which rises the more the filters are compressed. In thi...

Optimization of Compact Set Membership Representation for Distributed Computing March 2005

Bloom filters are widely used although not always optimal. Notably in application areas where false negatives are bearable, other techniques can clearly be better. This master thesis shows that at least for a specific area in the parameter space Bloom filters are significantly outperformed even by trivial methods. We provide an analysis, show that many application areas where Bloom filters are ...

Improving retouched Bloom filter for trading off selected false positives against false negatives

Where distributed agents must share voluminous set membership information, Bloom filters provide a compact, though lossy, way for them to do so. Numerous recent networking papers have examined the trade-offs between the bandwidth consumed by the transmission of Bloom filters, and the error rate, which takes the form of false positives. This paper is about the retouched Bloom filter (RBF). An RB...

Reducing False Positives of a Bloom Filter using Cross-Checking Bloom Filters

A Bloom filter is a compact data structure that supports membership queries on a set, allowing false positives. The simplicity and the excellent performance of a Bloom filter make it a standard data structure of great use in many network applications. In reducing the false positive rate of a Bloom filter, it is well known that the size of a Bloom filter and accordingly the number of hash indice...

Publication date: 2017